
Installing the pureScale Feature of Db2 Developer Edition in a Single RedHat Virtual Machine

twt Enterprise IT Community (twt企业IT社区), 2024-02-18

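[Author] Lu Chuan (陆川) has worked in the IT industry for more than 20 years. He is proficient in Db2 and Informix databases, specializes in operations and in the design and implementation of high-availability architectures, and is familiar with Sybase. He currently works on promoting Db2 and on upgrading and maintaining Informix.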


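1. Preface

This article describes in detail how to install the pureScale feature of Db2 Developer Edition on RedHat, deploying only one member and one CF. The main goal is to let readers install pureScale in a single virtual machine, without needing many machines, storage devices, or network equipment, so that they can get familiar with installing and using pureScale as quickly as possible.

This installation uses Db2 Developer Edition, which can be obtained from the URL below. After installation, no license needs to be registered, and there is no 90-day expiry.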

https://www.ibm.com/us-en/marketplace/ibm-db2-direct-and-developer-editions


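2. Virtual machine memory size

About 6 GB is recommended. The author found that with 4.5 GB, the CF later reports a memory allocation error when activate database is run.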
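
Before continuing, it can help to confirm how much memory the guest actually sees. The sketch below is only an illustration: the function name mem_total_mb is made up for this example, and it simply reads the MemTotal line of /proc/meminfo. The kernel usually reports a little less than the configured VM size, so the result should be compared loosely against the roughly 6 GB suggested above.

```shell
# Print total memory in MB from a meminfo-style file.
# $1 = file to read (defaults to /proc/meminfo); the function name is illustrative.
mem_total_mb() {
    awk '/^MemTotal:/ { printf "%d\n", $2 / 1024 }' "${1:-/proc/meminfo}"
}

# Usage:
#   echo "Visible memory: $(mem_total_mb) MB"
```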


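3. Supported operating systems

For Db2 11.1 on RedHat, the supported operating system versions are 6.7 through 7.4.
If you have Db2 10.5 instead, the supported RedHat versions are 5.9 through 6.5; however, 10.5 has no Developer Edition, so a license must be registered after installation.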


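4. Add a disk to the virtual machine

In the virtual machine's VM/Settings, select Disk, then click the Add button to add a 3 GB disk. This disk will hold the pureScale instance shared directory.

After the virtual machine boots, run fdisk -l and you will see the disk /dev/sdb. The pureScale instance will be stored in the GPFS file system created on this disk.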

[root@node01 Desktop]# fdisk -l
Disk /dev/sdb: 3221 MB, 3221225472 bytes
255 heads, 63 sectors/track, 391 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x00000000


5. Upload the installation package

sftp root@192.168.1.6

put v11.1_linuxx64_dec.tar.gz
Then extract the installation package:
cd /root/db2
gzip -d v11.1_linuxx64_dec.tar.gz
tar xvf v11.1_linuxx64_dec.tar


6. Preparation before installation

1) Modify the operating system kernel parameters

Edit /etc/sysctl.conf and add the following:
kernel.shmmni=1024
kernel.shmmax=4294967296
kernel.shmall=2097152
kernel.sem=250 256000 32 1024
kernel.msgmni=4096
kernel.msgmax=65536
kernel.msgmnb=65536
vm.dirty_background_ratio = 5
vm.dirty_ratio = 10
vm.swappiness = 5
vm.overcommit_memory = 0
After sysctl -p is run, these parameters take effect immediately; no restart is needed.
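
To double-check what was written to the file, the values can be read back and the page-based setting converted to bytes. This is a minimal sketch with made-up helper names (sysctl_conf_value, shmall_bytes), not part of any Db2 or RedHat tooling. It relies on the fact that kernel.shmall is counted in pages: with 4096-byte pages, 2097152 pages is 8 GB in total, while kernel.shmmax=4294967296 caps a single segment at 4 GB.

```shell
# Print the value assigned to a key in a sysctl.conf-style file.
# $1 = key, $2 = file (defaults to /etc/sysctl.conf). Helper name is illustrative.
sysctl_conf_value() {
    awk -F= -v key="$1" '
        /^[ \t]*#/ { next }                 # skip comment lines
        {
            k = $1; v = $2
            gsub(/^[ \t]+|[ \t]+$/, "", k)  # trim spaces around the key
            gsub(/^[ \t]+|[ \t]+$/, "", v)  # trim spaces around the value
            if (k == key) val = v           # the last assignment wins
        }
        END { if (val != "") print val }
    ' "${2:-/etc/sysctl.conf}"
}

# Convert kernel.shmall (pages) to bytes.
# $1 = pages, $2 = page size in bytes (defaults to the system page size).
shmall_bytes() {
    page=${2:-$(getconf PAGE_SIZE)}
    echo $(( $1 * $page ))
}

# Usage:
#   sysctl_conf_value kernel.sem
#   shmall_bytes "$(sysctl_conf_value kernel.shmall)"
```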

2) Add the users and groups needed to create the instance

groupadd --gid 1001 db2iadm1
groupadd --gid 1002 db2fadm1

useradd --uid 1004 -g db2iadm1 -m -d /home/db2sdin1 db2sdin1
useradd --uid 1003 -g db2fadm1 -m -d /home/db2sdfe1 db2sdfe1

passwd db2sdin1
passwd db2sdfe1

3) Install and configure OpenSSH

(1) Edit the following files and uncomment these lines:

File: /etc/ssh/ssh_config 
Port 22
Protocol 2,1

File: /etc/ssh/sshd_config
PermitRootLogin yes
PasswordAuthentication no

(2) Set up public-key authentication for the root and db2sdin1 users

[root@node01 ~]# pwd
/root
[root@node01 ~]# mkdir .ssh 
[root@node01 ~]# ssh-keygen -t dsa
Generating public/private dsa key pair.
Enter file in which to save the key (/root/.ssh/id_dsa): 
Enter passphrase (empty for no passphrase): 
Enter same passphrase again: 
Your identification has been saved in /root/.ssh/id_dsa.
Your public key has been saved in /root/.ssh/id_dsa.pub.
The key fingerprint is:
SHA256:nxIn2PWF/w+3ANlLrhLlibu+cTbMW2a7cgbRvHeasfA root@node01
The key's randomart image is:
+---[DSA 1024]----+
| |
| . |
| . + . |
| o . +o= |
| . S =o+oo |
| Bo==.oo.|
| ..=*.Xo=+|
| ++.BoEoo|
| .++o+o...|
+----[SHA256]-----+

[root@node01 ~]# cd .ssh 
[root@node01 .ssh]# cat id_dsa.pub >> authorized_keys
[root@node01 .ssh]# chmod 644 authorized_keys

[root@node01 .ssh]# ssh node01 hostname
The authenticity of host 'node01 (fe80::c460:f21f:70bf:afdb%ens33)' can't be established.
ECDSA key fingerprint is SHA256:TnZKcs1N8mOUb4g+Bbx2azWbApf/1ZBe5ILEkbgq7yw.
ECDSA key fingerprint is MD5:9f:b8:49:f3:80:d8:d7:e6:53:49:84:29:b9:1a:8c:46.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added 'node01,fe80::c460:f21f:70bf:afdb%ens33' (ECDSA) to the list of known hosts.
node01
[root@node01 .ssh]# ssh node01 hostname
node01
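The last step no longer asks for a password, which shows that passwordless login is configured correctly.
Follow the same steps to set up public-key authentication for the db2sdin1 user.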

4) Configure a local yum repository

Mount the ISO:
mount -o loop /root/RHEL-6.7-20150702.0-Server-x86_64-dvd1.iso /mnt/cdrom

Back up the files under /etc/yum.repos.d/:

mkdir /etc/yum.repos.d/backup

mv /etc/yum.repos.d/rh* /etc/yum.repos.d/backup/

Create a new repo file under /etc/yum.repos.d/. The file name is arbitrary but must end in .repo:

vi /etc/yum.repos.d/RHEL-ISO.repo

In vi, enter the following content in RHEL-ISO.repo:

[base]
name=iso
baseurl=file:///mnt/cdrom
enabled=1
gpgcheck=1
gpgkey=file:///etc/pki/rpm-gpg/RPM-GPG-KEY-redhat-release
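
As an alternative to typing the file in vi, the same content can be written with a here-document. The function below is a hedged sketch: the name write_iso_repo and its two parameters are invented for this example, and the generated content simply mirrors the repo definition shown above.

```shell
# Write a yum repo definition that points at a locally mounted ISO.
# $1 = repo file to create, $2 = directory where the ISO is mounted.
write_iso_repo() {
    cat > "$1" <<EOF
[base]
name=iso
baseurl=file://$2
enabled=1
gpgcheck=1
gpgkey=file:///etc/pki/rpm-gpg/RPM-GPG-KEY-redhat-release
EOF
}

# Usage (as root):
#   write_iso_repo /etc/yum.repos.d/RHEL-ISO.repo /mnt/cdrom
```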

Clear the cache:
yum clean all
Test whether the configuration works:
yum install httpd

5) Install the required operating system packages, following the page below

https://www.ibm.com/support/knowledgecenter/SSEPGG_11.1.0/com.ibm.db2.luw.qb.server.doc/doc/r0057441.html

The author found that a default operating system installation lacks the following packages:

yum groupinstall 'Infiniband Support'
yum install gcc 
yum install cpp
yum install gcc-c++
yum install kernel-devel
yum install ksh
yum install ntp
yum install sg3_utils
yum install libstdc++.so.6
yum install pam32*
yum install libcxgb*
yum install m4
yum install binutils-devel
yum install patch
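
To see at a glance which packages are still missing, each name can be queried with rpm -q. The sketch below is illustrative only: missing_packages is a made-up helper, and the PKG_QUERY variable is a hypothetical override added so the function can be dry-run on a machine without rpm. Patterns such as pam32* or the InfiniBand package group are not handled; only plain package names are.

```shell
# Print every package name that the query command reports as not installed.
# The query defaults to "rpm -q"; PKG_QUERY is a hypothetical override for dry runs.
missing_packages() {
    for pkg in "$@"; do
        ${PKG_QUERY:-rpm -q} "$pkg" >/dev/null 2>&1 || echo "$pkg"
    done
}

# Usage:
#   missing_packages gcc cpp gcc-c++ kernel-devel ksh ntp sg3_utils m4 binutils-devel patch
```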

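On RedHat 6.7, Db2 pureScale does not need the operating system package perl-Sys-Syslog, but the author found that RedHat 7.4 does need it; otherwise the TSA installation fails.

6) Change the host name

Edit the /etc/sysconfig/network file and set:

HOSTNAME=node01

After the change, restart the network:

/etc/init.d/network restart

Edit /etc/hosts:

192.168.1.6 node01 node01

However, the author found that the new host name takes effect only after a reboot.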
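
After the reboot it is worth confirming that the host name maps to the expected address in /etc/hosts. The following is a small sketch under the assumption that a simple first-match lookup is enough; hosts_ip_for is an invented helper name.

```shell
# Print the IP address of the first hosts-file entry that lists the given name.
# $1 = host name, $2 = hosts file (defaults to /etc/hosts). Helper name is illustrative.
hosts_ip_for() {
    awk -v name="$1" '
        /^[ \t]*#/ { next }   # skip comment lines
        { for (i = 2; i <= NF; i++) if ($i == name) { print $1; exit } }
    ' "${2:-/etc/hosts}"
}

# Usage:
#   [ "$(hosts_ip_for "$(hostname)")" = "192.168.1.6" ] || echo "check /etc/hosts"
```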

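7) Disable SELinux

On RHEL systems, if Security-Enhanced Linux (SELinux) is enabled and in enforcing mode, the installer may fail because of SELinux restrictions.

To determine whether SELinux is installed and in enforcing mode, use one of the following:

1. Check the /etc/sysconfig/selinux file.

2. Run the sestatus command.

Check the SELinux status: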

[root@node01 lib]# sestatus
SELinux status: enabled
SELinuxfs mount: /sys/fs/selinux
SELinux root directory: /etc/selinux
Loaded policy name: targeted
Current mode: enforcing
Mode from config file: enforcing
Policy MLS status: enabled
Policy deny_unknown status: allowed
Max kernel policy version: 28

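To disable SELinux, use one of the following methods:

1. Switch it to permissive mode by running setenforce 0 as the superuser.

2. Edit /etc/sysconfig/selinux (set SELINUX=disabled) and reboot the machine.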

[root@node01 server_dec]# sestatus
SELinux status: disabled

8) Run db2prereqcheck from the installation package and confirm that every check passes

./db2prereqcheck -v 11.1.4.4 -p -o /tmp/report3.out

Checking prerequisites for DB2 installation. Version "11.1.4.4". Operating system "Linux"

Validating "Linux distribution " ... 
Required minimum operating system distribution: "RHEL"; Version: "6"; Service pack: "7". 
Actual operating system distribution Version: "6"; Service pack: "7". 
Requirement matched.

Validating "kernel level " ... 
Required minimum operating system kernel level: "2.6.16". 
Actual operating system kernel level: "2.6.32". 
Requirement matched.

Validating "C++ Library version " ... 
Required minimum C++ library: "libstdc++.so.6" 
Standard C++ library is located in the following directory: "/usr/lib64/libstdc++.so.6.0.13". 
Actual C++ library: "CXXABI_1.3.1" 
Requirement matched.

Validating "32 bit version of "libstdc++.so.6" " ... 
Found the 32 bit "/usr/lib/libstdc++.so.6" in the following directory "/usr/lib". 
Requirement matched.

Validating "libaio.so version " ... 
DBT3553I The db2prereqcheck utility successfully loaded the libaio.so.1 file. 
Requirement matched.

Validating "libnuma.so version " ... 
DBT3610I The db2prereqcheck utility successfully loaded the libnuma.so.1 file. 
Requirement matched.

Validating "/lib/libpam.so*" ... 
Requirement matched. 
DBT3533I The db2prereqcheck utility has confirmed that all installation prerequisites were met.

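The author later found, while installing on RedHat 7.4, that db2prereqcheck passes even when the operating system package m4 is not installed, but instance creation then fails. It is therefore advisable to check the pureScale installation prerequisites strictly against the official documentation; otherwise instance creation will fail.
Note that the .ssh directory under the db2sdin1 home directory must have permission 700; otherwise db2prereqcheck reports errors.

9) Edit the file /var/ct/cfg/netmon.cf

Add the following line:
!REQD eth0 192.168.1.6
Do not change anything else in the file.
TSA uses this file to monitor the network status of each node.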
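
Because the rest of the file must stay untouched, appending the line only when it is not already present is a safe way to make this step repeatable. The sketch below assumes the entry format shown above; add_netmon_entry is an invented helper name, not a TSA command.

```shell
# Append a "!REQD <interface> <ip>" line to netmon.cf unless the exact line already exists.
# $1 = interface name, $2 = IP address, $3 = file (defaults to /var/ct/cfg/netmon.cf).
add_netmon_entry() {
    file=${3:-/var/ct/cfg/netmon.cf}
    line="!REQD $1 $2"
    # -x matches the whole line, -F treats the pattern as a fixed string
    grep -qxF -e "$line" "$file" 2>/dev/null || printf '%s\n' "$line" >> "$file"
}

# Usage (as root):
#   add_netmon_entry eth0 192.168.1.6
```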


7. Install the Db2 product

Run ./db2_install:
./db2_install
Read the license agreement file in the db2/license directory.


To accept those terms, enter "yes". Otherwise, enter "no" to cancel the install process. [yes/no]
yes

Default directory for installation of products - /opt/ibm/db2/V11.1


Install into default directory (/opt/ibm/db2/V11.1) ? [yes/no] 
yes

Specify one of the following keywords to install DB2 products.

SERVER 
CLIENT 
RTCL

Enter "help" to redisplay product names.

Enter "quit" to exit.


SERVER


Do you want to install the DB2 pureScale Feature? [yes/no] 
yes
After the installation finishes, check the log files under /tmp to confirm that no errors occurred.


8. Create the instance shared directory

[root@node01 instance]# ./db2cluster_prepare -instance_shared_dev /dev/sdb -instance_shared_mount /db2sd -t /tmp/sd_prepare.trc -l /tmp/sd_prepare.log

/db2sd is the instance shared directory that db2icrt uses later.
DBI1446I The db2cluster_prepare command is running.

DB2 installation is being initialized.

Total number of tasks to be performed: 1 
Total estimated time for all tasks to be performed: 60 second(s)

Task #1 start
Description: Creating IBM General Parallel File System (GPFS) Cluster and Filesystem 
Estimated time 60 second(s) 
Task #1 end

The execution completed successfully.

For more information see the DB2 installation log at "/tmp/sd_prepare.log".
DBI1070I Program db2cluster_prepare completed successfully.

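Note: in a real production or test environment, there is no need to use db2cluster_prepare to create GPFS; db2icrt automatically creates GPFS as the instance shared directory.


9. Create the instance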

./db2icrt -m node01 -mnet node01 -cf node01 -cfnet node01 -instance_shared_dir /db2sd -tbdev node01 -u db2sdfe1 -d db2sdin1

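Here an IP address is used as the tiebreaker in place of a tiebreaker disk.

After creation finishes, check the db2icrt log files under /tmp to confirm that no errors were reported during instance creation.


10. Check the license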

$db2licm -l

Product name: "IBM DB2 Developer-C Edition"
License type: "Community"
Expiry date: "Permanent"
Product identifier: "db2dec"
Version information: "11.1"
Max amount of memory (GB): "16"
Max number of cores: "4"
Max amount of table space (GB): "100"

The license is permanent; the last three lines are the limits of the Developer Edition.
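
If the expiry check needs to be scripted, the relevant field can be pulled out of the db2licm -l output. This is a sketch that assumes the output format shown above (a quoted value after "Expiry date:"); license_expiry is a made-up helper name.

```shell
# Read "db2licm -l" output on stdin and print the value of each "Expiry date" line.
license_expiry() {
    sed -n 's/^Expiry date:[[:space:]]*"\([^"]*\)".*$/\1/p'
}

# Usage (as the instance owner):
#   db2licm -l | license_expiry
```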


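11. Create the database

The installation in the virtual machine uses a TCP/IP network rather than InfiniBand or 10 Gigabit RoCE adapters, so the DB2_SD_ALLOW_SLOW_NETWORK registry variable must be set: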

[db2sdin1@node01 ~]$ db2set -lr | grep -i sd
DB2_SD_ALLOW_SLOW_NETWORK
[db2sdin1@node01 ~]$ db2set DB2_SD_ALLOW_SLOW_NETWORK=on
[db2sdin1@node01 ~]$ db2 terminate
DB20000I The TERMINATE command completed successfully.
[db2sdin1@node01 ~]$ db2start 
[db2sdin1@node01 ~]$ db2 create db psdb 
DB20000I The CREATE DATABASE command completed successfully.
[db2sdin1@node01 ~]$ db2 activate db psdb 
DB20000I The ACTIVATE DATABASE command completed successfully.


12. Check the cluster instance status

[db2sdin1@node01 ~]$ db2instance -list


13. Problems encountered during installation

1) After public-key authentication was set up for the db2sdin1 user, db2prereqcheck failed with the following error:

The db2prereqcheck tool detected Interface Adapter "node01" is an Ethernet adapter configured for Sockets. Configure it in dat.conf if it is RDMA capable on host machine named "node01".
The db2prereqcheck tool detected Interface Adapter "node01" is an Ethernet adapter configured for Sockets. Configure it in dat.conf if it is RDMA capable on host machine named "node01".
Validating "passwordless ssh" ...
DBT3567E The db2prereqcheck utility found that db2locssh is not configured and passwordless SSH is not enabled between host "node01" and host "node01".
ERROR : Requirement not matched.

Validating "PING TEST" ...
DBT3572W The db2prereqcheck utility found that netname "node01" is not pingable from host "node01".
DBT3572W The db2prereqcheck utility found that netname "node01" is not pingable from host "node01".
WARNING : Requirement not matched.

DBT3572W The db2prereqcheck utility found that netname "node01" is not pingable from host "node01".

Solution:

The permissions on the .ssh directory were wrong:

drwxr-xr-x 2 db2sdin1 db2iadm1 4096 May 9 07:26 .ssh

The .ssh directory must have permission 700:

[db2sdin1@node01 ~]$ chmod 700 .ssh
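
Since a wrong mode on .ssh produces a misleading "passwordless SSH is not enabled" message, a quick check before running db2prereqcheck can save time. The sketch below is illustrative; ssh_dir_mode_ok is an invented helper that inspects the mode string printed by ls -ld.

```shell
# Succeed only if the given directory has mode 700 (shown by ls as drwx------).
# $1 = path to the .ssh directory. Helper name is illustrative.
ssh_dir_mode_ok() {
    mode=$(ls -ld "$1" | cut -c1-10)
    [ "$mode" = "drwx------" ]
}

# Usage:
#   ssh_dir_mode_ok /home/db2sdin1/.ssh || chmod 700 /home/db2sdin1/.ssh
```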

2) db2icrt reports the following error in its log

The author found this problem on RedHat 7.4.

ERROR: A reachable IP address could not be automatically determined that did
not belong to one of the hosts in the DB2 pureScale instance. There may be a
problem with the network adapters gateway IP address or with the hosts network
connection. Verify connectivity for the hosts and manually edit the
configuration file /var/ct/cfg/netmon.cf on each host to include an IP on the
network outside of the DB2 pureScale instance that can be reached by the ping
command so that DB2 may ensure network connectivity. Hosts: "node01 ". The
format of /var/ct/cfg/netmon.cf lines is as follows: !REQD eth1 9.26.123.245

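The virtualized network adapter is named ens33, not eth1 or similar.

[root@node01 tmp]# ifconfig -a
ens33: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 1500
inet 192.168.1.5 netmask 255.255.255.0 broadcast 192.168.1.255

[root@node01 tmp]# cat /var/ct/cfg/netmon.cf
!REQD ens33 192.168.1.5

It is not clear why pureScale does not accept a name such as ens33.

One workaround is to set the environment variable below so that the check is skipped during instance creation: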

export SKIP_NETMON_VALIDATION=YES

3) Activating the database fails with SQL2049N

db2 activate db psdb 
SQL2049N Database activation failed because there is insufficient CF memory. 
Reason code = "1".

Running db2 ? SQL2049N shows the description of the error.

Solution:

db2 update dbm cfg using numdb 1

Then increase the virtual machine's memory from 4.5 GB to 6 GB.

4) When installing on RedHat 7.4, db2_install cannot install the GPFS package gpfs.ext-4.2.3-0.x86_64.rpm

Solution:

Run installGPFS in debug mode:

cd server_dec/db2/linuxamd64/gpfs
./installGPFS -a -f -d > /tmp/gpfs.debug.out 2>&1

Open /tmp/gpfs.debug.out; it contains the following error:

+ rpm -i gpfs.ext-4.2.3-0.x86_64.rpm
error: Failed dependencies:
m4 is needed by gpfs.ext-4.2.3-0.x86_64

Running yum list m4 then showed that the package was not installed.

Either of the following two commands installs the m4 operating system package:

yum install m4

or

yum groupinstall 'Infiniband Support'
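
When the debug log is long, the missing dependencies can be extracted instead of searched for by eye. This sketch assumes the "X is needed by Y" wording shown in the rpm error above; missing_rpm_deps is a made-up helper name.

```shell
# Print each dependency named in "... is needed by ..." lines of a log file, once.
# $1 = log file to scan. Helper name is illustrative.
missing_rpm_deps() {
    awk '/ is needed by / { print $1 }' "$1" | sort -u
}

# Usage:
#   missing_rpm_deps /tmp/gpfs.debug.out
```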

5) An alert causes db2start to fail, so the instance cannot start normally

$ db2start
05/14/2019 19:18:48 0 0 SQL6036N START or STOP DATABASE MANAGER command is already in progress.
SQL1032N No start database manager command was issued. SQLSTATE=57019

df -h shows that GPFS has started normally:

$ df -h
Filesystem Size Used Avail Use% Mounted on
/dev/mapper/rhel-root 17G 16G 1.8G 90% /
devtmpfs 2.8G 0 2.8G 0% /dev
tmpfs 2.8G 4.0K 2.8G 1% /dev/shm
tmpfs 2.8G 9.0M 2.8G 1% /run
tmpfs 2.8G 0 2.8G 0% /sys/fs/cgroup
/dev/sda1 1014M 179M 836M 18% /boot
tmpfs 568M 24K 568M 1% /run/user/0
db2fs1 3.0G 1.1G 2.0G 36% /db2sd
[db2sdin1@node01 ~]$ db2instance -list

The output shows an alert:

ID TYPE STATE HOME_HOST CURRENT_HOST ALERT PARTITION_NUMBER LOGICAL_PORT NETNAME
-- ---- ----- --------- ------------ ----- ---------------- ------------ -------
0 MEMBER ERROR node01 node01 YES 0 0 node01
128 CF STOPPED node01 node01 NO - 0 node01

HOSTNAME STATE INSTANCE_STOPPED ALERT
-------- ----- ---------------- -----
node01 ACTIVE NO YES
There is currently an alert for members, CFs, hosts, cluster file system or cluster configuration in the data-sharing instance. For more information on the alert, its impact, and how to clear it, run the following command: 'db2cluster -list -alert'.

[db2sdin1@node01 ~]$ db2cluster -list -alert 
1.
Alert: DB2 member '0' failed to start on its home host 'node01'. The cluster manager will attempt to restart the DB2 member in restart light mode on another host. Check the db2diag.log for messages concerning failures on host 'node01' for member '0'.

Action: Check the member db2diag log files for messages about member failures and the cluster caching facility cfdiag log files for messages about CF failures on the host. If there are alerts about network adapters not responding, this alert cannot be cleared manually. It will be cleared when a network adapter becomes available. If it is not a problem with network adapters, this alert needs to be manually cleared after other alerts are handled. To clear this alert run the following command: 'db2cluster -cm -clear -alert -member 0'. For more information, see the 'Troubleshooting options for the db2cluster command' topic in the DB2 Information Center.

Impact: DB2 member '0' will not be able to service requests until this alert has been cleared and the DB2 member returns to its home host.

[db2sdin1@node01 ~]$ db2cluster -cm -clear -alert -member 0
The alerts have been successfully cleared.
[db2sdin1@node01 ~]$ db2cluster -list -alert 
There are no alerts
[db2sdin1@node01 ~]$ db2instance -list
ID TYPE STATE HOME_HOST CURRENT_HOST ALERT PARTITION_NUMBER LOGICAL_PORT NETNAME
-- ---- ----- --------- ------------ ----- ---------------- ------------ -------
0 MEMBER ** RESTARTING** node01 node01 NO 0 0 node01
128 CF STOPPED node01 node01 NO - 0 node01

HOSTNAME STATE INSTANCE_STOPPED ALERT
-------- ----- ---------------- -----
node01 ACTIVE NO NO
[db2sdin1@node01 ~]$ db2start
05/14/2019 19:24:17 0 0 SQL1026N The database manager is already active.
SQL1026N The database manager is already active.
[db2sdin1@node01 ~]$ db2pd -

Database Member 0 -- Active -- Up 0 days 00:01:12 -- Date 2019-05-14-19.24.24.887219
[db2sdin1@node01 ~]$ db2instance -list
ID TYPE STATE HOME_HOST CURRENT_HOST ALERT PARTITION_NUMBER LOGICAL_PORT NETNAME
-- ---- ----- --------- ------------ ----- ---------------- ------------ -------
0 MEMBER STARTED node01 node01 NO 0 0 node01
128 CF PRIMARY node01 node01 NO - 0 node01

HOSTNAME STATE INSTANCE_STOPPED ALERT
-------- ----- ---------------- -----
node01 ACTIVE NO NO
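
The same status check can be scripted by parsing the db2instance -list output. The sketch below is an assumption-laden illustration, not an IBM tool: instance_healthy is an invented name, and it treats the instance as healthy only when at least one member and one CF are listed, every MEMBER row is in state STARTED, and every CF row is in state PRIMARY or PEER.

```shell
# Read "db2instance -list" output on stdin.
# Exit status 0 means: at least one member and one CF, all members STARTED,
# all CFs PRIMARY or PEER. Any other state counts as unhealthy.
instance_healthy() {
    awk '
        $2 == "MEMBER" { members++; if ($3 != "STARTED") bad++ }
        $2 == "CF"     { cfs++;     if ($3 != "PRIMARY" && $3 != "PEER") bad++ }
        END { exit !(members > 0 && cfs > 0 && bad == 0) }
    '
}

# Usage (as the instance owner):
#   db2instance -list | instance_healthy || db2cluster -list -alert
```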


Recommended reading:

  • Installing Db2 pureScale in a Single SuSE Linux Virtual Machine

  • http://www.talkwithtrend.com/Article/244125

